conditional intensity
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- North America > Canada (0.04)
New User Event Prediction Through the Lens of Causal Inference
Yuchi, Henry Shaowu, Zhu, Shixiang, Dong, Li, Arisoy, Yigit M., Spencer, Matthew C.
Modeling and analyzing event series generated by heterogeneous users with varied behavioral patterns is central to many everyday applications, including credit card fraud detection, online platform user recommendation, and social network analysis. The most commonly adopted approach to this task is to classify users into behavior-based categories and analyze each category separately. However, this approach requires extensive data to fully understand user behavior, which makes it difficult to model newcomers who have no history. In this paper, we propose a novel discrete event prediction framework for new users through the lens of causal inference. Our method offers an unbiased prediction for new users without needing to know their categories. We treat the user event history as the "treatment" for future events and the user category as the key confounder. Thus, the prediction problem can be framed as counterfactual outcome estimation, with the new user model trained on an adjusted dataset where each event is re-weighted by its inverse propensity score. We demonstrate the superior performance of the proposed framework with a numerical simulation study and two real-world applications: Netflix rating prediction and seller contact prediction for customer support at Amazon.
- North America > United States > New York (0.04)
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
- Information Technology > Services (0.68)
- Law Enforcement & Public Safety > Fraud (0.54)
- Media > Film (0.48)
- Information Technology > Security & Privacy (0.48)
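The re-weighting step described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes discrete user categories that are observed in the training data, a discrete "treatment" label standing in for the event history, and a propensity estimated by simple empirical frequencies. The function names `inverse_propensity_weights` and `weighted_mean` are hypothetical.

```python
from collections import Counter


def inverse_propensity_weights(categories, treatments):
    """Return 1 / P(treatment | category) for each record.

    The propensity is estimated by empirical frequencies, which is an
    assumption made for this sketch only.
    """
    pair_counts = Counter(zip(categories, treatments))
    category_counts = Counter(categories)
    return [
        category_counts[c] / pair_counts[(c, t)]
        for c, t in zip(categories, treatments)
    ]


def weighted_mean(outcomes, weights):
    """Weighted average of outcomes, normalized by the total weight."""
    return sum(o * w for o, w in zip(outcomes, weights)) / sum(weights)


# Toy data: the confounder (category) and the treatment for four records.
categories = ["a", "a", "a", "b"]
treatments = [1, 1, 0, 1]
outcomes = [1.0, 1.0, 0.0, 1.0]

weights = inverse_propensity_weights(categories, treatments)
estimate = weighted_mean(outcomes, weights)
```

Records whose treatment is rare within their category receive larger weights, so the re-weighted dataset behaves as if treatment were assigned independently of category.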
Non-Parametric Estimation of Multi-dimensional Marked Hawkes Processes
An extension of the Hawkes process, the marked Hawkes process allows the jump size to vary from event to event, in contrast to the constant jump size of a Hawkes process without marks. While extensive literature has been dedicated to the non-parametric estimation of both linear and non-linear Hawkes processes, there remains a significant gap regarding the marked Hawkes process. In response, we propose a methodology for estimating the conditional intensity of the marked Hawkes process. We introduce two distinct models: Shallow Neural Hawkes with marks, for Hawkes processes with excitatory kernels, and Neural Network for Non-Linear Hawkes with Marks, for non-linear Hawkes processes. Both approaches take the past arrival times and their corresponding marks as input and output the arrival intensity. The approach is entirely non-parametric and preserves the interpretability associated with the marked Hawkes process. To validate our method, we test it on synthetic datasets with known ground truth. Additionally, we apply it to cryptocurrency order book data, demonstrating its applicability to real-world scenarios.
- Asia > India > Karnataka > Bengaluru (0.04)
- North America > United States > New York (0.04)
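To make the role of marks in the abstract above concrete, here is a minimal sketch of a conditional intensity in which each past event's jump is scaled by its mark. It assumes a parametric exponential kernel purely for illustration; the paper itself estimates the intensity non-parametrically with neural networks. The parameters `mu`, `alpha`, and `beta` and the function name are hypothetical.

```python
import math


def marked_hawkes_intensity(t, times, marks, mu=0.5, alpha=1.0, beta=2.0):
    """Conditional intensity at time t given past events and their marks.

    Assumed form (illustration only):
        lambda(t) = mu + sum over t_i < t of m_i * alpha * exp(-beta * (t - t_i))
    With all marks equal to 1 this reduces to an unmarked Hawkes process
    with a constant jump size alpha.
    """
    excitation = sum(
        m * alpha * math.exp(-beta * (t - ti))
        for ti, m in zip(times, marks)
        if ti < t  # only strictly earlier events contribute
    )
    return mu + excitation


# One event at time 0 with mark 2 doubles the jump relative to mark 1.
marked = marked_hawkes_intensity(1.0, [0.0], [2.0])
unmarked = marked_hawkes_intensity(1.0, [0.0], [1.0])
baseline = marked_hawkes_intensity(1.0, [], [])
```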
Learning Point Processes using Recurrent Graph Network
Dash, Saurabh, She, Xueyuan, Mukhopadhyay, Saibal
We present a novel Recurrent Graph Network (RGN) approach for predicting discrete marked event sequences by learning the underlying complex stochastic process. Using the framework of Point Processes, we interpret a marked discrete event sequence as the superposition of different sequences, each of a unique type. The nodes of the Graph Network use LSTM to incorporate past information, whereas a Graph Attention Network (GAT Network) introduces strong inductive biases to capture the interaction between these different types of events. By changing the self-attention mechanism from attending over past events to attending over event types, we obtain a reduction in time and space complexity from $\mathcal{O}(N^2)$ (total number of events) to $\mathcal{O}(|\mathcal{Y}|^2)$ (number of event types). Experiments show that the proposed approach improves performance in log-likelihood, prediction, and goodness-of-fit tasks with lower time and space complexity compared to state-of-the-art Transformer-based architectures.
Goodness-of-Fit Test of Mismatched Models for Self-Exciting Processes
Wei, Song, Zhu, Shixiang, Zhang, Minghe, Xie, Yao
We develop a goodness-of-fit (GOF) test for generative models of self-exciting processes by connecting this problem to the classical statistical theory of the quasi-maximum-likelihood estimator (QMLE). We present a non-parametric self-normalizing statistic for the GOF test, the Generalized Score (GS) statistic, and explicitly capture the model misspecification when establishing the asymptotic distribution of the GS statistic. Numerical experiments based on simulation and real data validate our theory and demonstrate the proposed GS test's good performance.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > Middle East > Iraq (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Asia > Japan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
A Functional Model for Structure Learning and Parameter Estimation in Continuous Time Bayesian Network: An Application in Identifying Patterns of Multiple Chronic Conditions
Faruqui, Syed Hasib Akhter, Alaeddini, Adel, Wang, Jing, Jaramillo, Carlos A.
Bayesian networks are powerful statistical models for studying the probabilistic relationships among a set of random variables, with major applications in disease modeling and prediction. Here, we propose a continuous time Bayesian network with conditional dependencies, represented as Poisson regression, to model the impact of exogenous variables on the conditional dependencies of the network. We also propose an adaptive regularization method with an intuitive early stopping feature based on density-based clustering for efficient learning of the structure and parameters of the proposed network. Using a dataset of patients with multiple chronic conditions extracted from electronic health records of the Department of Veterans Affairs, we compare the performance of the proposed approach with some of the existing methods in the literature for both short-term (one-year ahead) and long-term (multi-year ahead) predictions. The proposed approach provides a sparse, intuitive representation of the complex functional relationships between multiple chronic conditions. It also provides the capability of analyzing multiple disease trajectories over time given any combination of prior conditions.
- North America > United States > Texas (0.14)
- Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
- Government > Military (1.00)
- Health & Medicine > Health Care Technology > Medical Record (0.86)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
Spatio-Temporal Point Processes with Attention for Traffic Congestion Event Modeling
Zhu, Shixiang, Ding, Ruyi, Zhang, Minghe, Van Hentenryck, Pascal, Xie, Yao
We present a novel framework for modeling traffic congestion events over road networks based on new mutually exciting spatio-temporal point process models with attention mechanisms and neural network embeddings. Using multi-modal data that combines count data from traffic sensors with police reports of traffic incidents, we aim to capture two types of triggering effect for congestion events: current traffic congestion at one location may cause future congestion over the road network, and traffic incidents may cause congestion to spread. To capture the non-homogeneous temporal dependence of an event on the past, we introduce a novel attention-based mechanism built on neural network embeddings for the point process model. To incorporate the directional spatial dependence induced by the road network, we adapt the "tail-up" model from spatial statistics to the traffic network setting. We demonstrate the superior performance of our approach compared to state-of-the-art methods on both synthetic and real data.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- Transportation > Infrastructure & Services (1.00)
- Transportation > Ground > Road (1.00)
A Multi-Channel Neural Graphical Event Model with Negative Evidence
Gao, Tian, Subramanian, Dharmashankar, Shanmugam, Karthikeyan, Bhattacharjya, Debarun, Mattei, Nicholas
Event datasets are sequences of events of various types occurring irregularly over the timeline, and they are increasingly prevalent in numerous domains. Existing work on modeling events using conditional intensities relies either on some underlying parametric form to capture historical dependencies, or on non-parametric models that focus primarily on tasks such as prediction. We propose a non-parametric deep neural network approach to estimate the underlying intensity functions. We use a novel multi-channel RNN that optimally reinforces the negative evidence of no observable events by introducing fake event epochs within each consecutive inter-event interval. We evaluate our method against state-of-the-art baselines on model fitting tasks as gauged by log-likelihood. Through experiments on both synthetic and real-world datasets, we find that our proposed approach outperforms existing baselines on most of the datasets studied.
- South America > Argentina (0.16)
- South America > Brazil (0.14)
- South America > Venezuela (0.04)
- Health & Medicine (1.00)
- Government (0.68)
Reinforcement Learning of Spatio-Temporal Point Processes
Zhu, Shixiang, Li, Shuang, Xie, Yao
Spatio-temporal event data is ubiquitous in various applications, such as social media, crime events, and electronic health records. Spatio-temporal point processes offer a versatile framework for modeling such event data, as they can jointly capture spatial and temporal dependency. A key question is how to estimate the generative model for such point processes, which enables the subsequent machine learning tasks. Existing works mainly focus on parametric models for the conditional intensity function, such as the widely used multi-dimensional Hawkes processes. However, parametric models tend to lack flexibility in tackling real data. On the other hand, non-parametric models for spatio-temporal point processes tend to be less interpretable. We introduce a novel and flexible semi-parametric spatio-temporal point process model by combining spatial statistical models based on heterogeneous Gaussian mixture diffusion kernels, whose parameters are represented using neural networks. We learn the model using a reinforcement learning framework, where the reward function is defined via the maximum mean discrepancy (MMD) between the empirical processes generated by the model and the real data. Experiments based on real data show the superior performance of our method relative to the state-of-the-art.
- North America > United States > California (0.15)
- North America > United States > Georgia > Fulton County > Atlanta (0.14)
- North America > United States > New York > New York County > New York City (0.14)
- Health & Medicine > Health Care Technology > Medical Record (0.54)
- Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.46)
- Energy (0.46)
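The MMD-based reward mentioned in the abstract above compares samples generated by the model with real data. As a rough sketch of the quantity involved, the snippet below computes the biased estimate of the squared MMD between two one-dimensional samples (for example, event times) under a Gaussian kernel. The kernel choice, the `bandwidth` parameter, and the reduction to one dimension are assumptions for illustration and are not taken from the paper.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth=1.0):
    """Average kernel value over all pairs (x, y)."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: E k(x,x') + E k(y,y') - 2 E k(x,y)."""
    return (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )


real_times = [0.1, 0.4, 0.9, 1.3]
close_sample = [0.1, 0.4, 0.9, 1.3]
far_sample = [5.0, 5.5, 6.1, 6.8]

same = mmd_squared(real_times, close_sample)
different = mmd_squared(real_times, far_sample)
```

A smaller value indicates that the generated sample is closer to the real data in the kernel's feature space, so a reward for a generator could be taken as the negative of this quantity.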